A study of Thompson Sampling with Parameter h

Author

Qiang Ha

Abstract

Thompson Sampling is a well-known Bayesian algorithm for the stochastic multi-armed bandit problem. At each time step the algorithm chooses each arm with probability proportional to the probability of it being the current best arm. We modify the strategy by introducing a parameter h which alters the importance of the probability of an arm being the current best arm. We show that the optimality of Thompson sampling is robust to this perturbation within a range of parameter values for two-armed bandits.
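The abstract does not state how h enters the sampling rule. The sketch below is only an illustration under one assumed form: if p_i is the posterior probability that arm i is currently the best, arm i is chosen with probability p_i^h / sum_j p_j^h, so that h = 1 gives back ordinary Thompson Sampling and h = 0 gives uniform play. The Bernoulli rewards, the Beta(1, 1) priors, the Monte Carlo estimate of p_i, and the function names `prob_best`, `tempered`, and `run` are all assumptions made for this example, not taken from the paper.

```python
import random


def prob_best(alpha, beta, n_draws, rng):
    """Monte Carlo estimate of P(arm i has the highest mean) under Beta posteriors."""
    k = len(alpha)
    wins = [0] * k
    for _ in range(n_draws):
        draws = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        wins[draws.index(max(draws))] += 1
    return [w / n_draws for w in wins]


def tempered(probs, h):
    """Assumed role of h: reweight to p_i**h / sum_j p_j**h (h >= 0)."""
    if h < 0:
        raise ValueError("this sketch only handles h >= 0")
    top = max(probs)
    # Dividing by the largest probability keeps the top weight at 1,
    # which avoids an all-zero weight vector when h is large.
    weights = [(p / top) ** h for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


def run(true_means, h, horizon, seed=0, n_draws=200):
    """Play a Bernoulli bandit with the h-modified rule; return pull counts per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1.0] * k  # Beta(1, 1) prior on each arm (assumption)
    beta = [1.0] * k
    pulls = [0] * k
    for _ in range(horizon):
        probs = tempered(prob_best(alpha, beta, n_draws, rng), h)
        arm = rng.choices(range(k), weights=probs)[0]
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward       # posterior update on success
        beta[arm] += 1 - reward    # posterior update on failure
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Two-armed example; the means 0.7 and 0.3 are arbitrary illustration values.
    for h in (0.5, 1.0, 2.0):
        print(h, run([0.7, 0.3], h, horizon=300))
```

Under this assumed form, values of h above 1 concentrate play on the arm currently believed best, while values below 1 spread play more evenly across arms.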

Similar resources

Parameter Identifiability Issues in a Latent Markov Model for Misclassified Binary Responses

Medical researchers may be interested in disease processes that are not directly observable. Imperfect diagnostic tests may be used repeatedly to monitor the condition of a patient in the absence of a gold standard. We consider parameter identifiability and estimability in a Markov model for alternating binary longitudinal responses that may be misclassified. Exactly ...

Horvitz-Thompson estimator of population mean under inverse sampling designs

Inverse sampling design is generally considered an appropriate technique when the population is divided into two subpopulations, one of which contains only a few units. In this paper, we derive the Horvitz-Thompson estimator for the population mean under inverse sampling designs, where subpopulation sizes are known. We then introduce an alternative unbiased estimator, corresponding to post-st...

Prognostic Factors Affecting the Results of Modified Thompson Quadricepsplasty for the Treatment of Extension Contracture of the Knee

Background: Knee extension contracture is a disabling complication after fractures around the knee. In this study we aimed to study factors influencing the outcomes of quadricepsplasty for the treatment of traumatic knee extension contracture. We hypothesized that there is no factor influencing the final range of knee motion. Methods: In this retrospective study, we included 64 patients who unde...

Asymptotically Optimal Algorithms for Budgeted Multiple Play Bandits

We study a variant of the multi-armed bandit problem with multiple plays in which the user wishes to sample the m out of k arms with the highest expected rewards, but at any given time can only sample ℓ ≤ m arms. When ℓ = m, Thompson sampling was recently shown to be asymptotically efficient. We derive an asymptotic regret lower bound for any uniformly efficient algorithm in our new setting whe...

Experimental and Theoretical Study of Thompson Seedless Grapes Drying using Solar Evacuated Tube Collector with Force Convection Method

An evacuated tube solar collector drier is designed and developed to study, analytically and experimentally, the drying kinetics of Thompson seedless grapes in Pune, India. Drying experiments were carried out in the months of April-June for three consecutive years, 2013-2015. During the experimentation, temperatures of hot and cold air at various places, ambient relative humidity and humidity varia...

Journal title:

CoRR

Volume abs/1710.02174

Publication date 2017
